Graph or Relational Databases: A Speed Comparison for Process Mining Algorithm

نویسندگان

  • Jeevan Joishi
  • Ashish Sureka
چکیده

Process-Aware Information System (PAIS) are IT systems that manages, supports business processes and generate large event logs from execution of business processes. An event log is represented as a tuple of the form CaseID, TimeStamp, Activity and Actor. Process Mining is an emerging area of research that deals with the study and analysis of business processes based on event logs. Process Mining aims at analyzing event logs and discover business process models, enhance them or check for conformance with an a priori model. The large volume of event logs generated are stored in databases. Relational databases perform well for certain class of applications. However, there are certain class of applications for which relational databases are not able to scale. A number of NoSQL databases have emerged to encounter the challenges of scalability. Discovering social network from event logs is one of the most challenging and important Process Mining task. Similar-Task and Sub-Contract algorithms are some of the most widely used Organizational Mining techniques. Our objective is to investigate which of the databases (Relational or Graph) perform better for Organizational Mining under Process Mining. An intersection of Process Mining and Graph Databases can be accomplished by modelling these Organizational Mining metrics with graph databases. We implement Similar-Task and Sub-Contract algorithms on relational and NoSQL (graph-oriented) databases using only query language constructs. We conduct empirical analysis on a large real world data set to compare the performance of row-oriented database and NoSQL graph-oriented database. We benchmark performance factors like query execution time, CPU usage and disk/memory space usage for NoSQL graph-oriented database against row-oriented database.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Relational Databases Query Optimization using Hybrid Evolutionary Algorithm

Optimizing the database queries is one of hard research problems. Exhaustive search techniques like dynamic programming is suitable for queries with a few relations, but by increasing the number of relations in query, much use of memory and processing is needed, and the use of these methods is not suitable, so we have to use random and evolutionary methods. The use of evolutionary methods, beca...

متن کامل

Mining Functional Dependency from Relational Databases Using Equivalent Classes and Minimal Cover

Data Mining (DM) represents the process of extracting interesting and previously unknown knowledge from data. This study proposes a new algorithm called FD_Discover for discovering Functional Dependencies (FDs) from databases. This algorithm employs some concepts from relational databases design theory specifically the concepts of equivalences and the minimal cover. It has resulted in large imp...

متن کامل

Dissimilar friction stir lap welding of Al-Mg to CuZn34: Application of grey relational analysis for optimizing process parameters

This study focused on the optimization of Al—Mg to CuZn34 friction stir lap welding (FSLW) process for optimal combination of rotational and traverse speeds in order to yield favorable fracture load using Grey relational analysis (GRA). First, the degree of freedom was calculated for the system. Then, the experiments based on the target values and number of considered levels, corresponding orth...

متن کامل

Empirical Analysis on Comparing the Performance of Alpha Miner Algorithm in SQL Query Language and NoSQL Column-Oriented Databases Using Apache Phoenix

Process-Aware Information Systems (PAIS) is an IT system that support business processes and generate large amounts of event logs from the execution of business processes. An event log is represented as a tuple of CaseID, Timestamp, Activity and Actor. Process Mining is a new and emerging field that aims at analyzing the event logs to discover, enhance and improve business processes and check c...

متن کامل

OO-FSG: An Object-Oriented Approach to Mine Frequent Subgraphs

Frequent subgraph mining (FSG) has always been an important issue in data mining. Several frequent subgraph mining methods have been developed for mining graph data. However, most of these are main memory algorithms in which scalability is a bigger issue. A few algorithms have opted for a relational approach that stores the graph data in relational tables. However, relational databases have the...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:
  • CoRR

دوره abs/1701.00072  شماره 

صفحات  -

تاریخ انتشار 2016